DOCUMENT RESUME

ED 075 489                                        TM 002 568

TITLE           Massachusetts Fourth Grade Testing Program 1971.
INSTITUTION     Massachusetts State Dept. of Education, Boston.
REPORT NO       Test-Bull-1
PUB DATE        Apr 71
NOTE            4p.
EDRS PRICE      MF-$0.65 HC-$3.29
DESCRIPTORS     *Achievement Tests; Aptitude Tests; Basic Skills;
                *Grade 4; *Measurement Instruments; State Programs;
                Statistical Analysis; *Testing Programs; *Test
                Results
IDENTIFIERS     *Massachusetts

ABSTRACT
The testing of every fourth-grade classroom in
Massachusetts was carried out in an effort to answer the following
questions: (1) What are the levels of mastery of basic skills in
Massachusetts fourth grades? Are there differences in achievement
between skills?; (2) What educational needs can be inferred for
Massachusetts students, based on basic skills testing?; (3) Do
testing data reveal the influence of Federal programs?; (4) Does the
product of education vary according to available resources - financial
outlay, professional support, materials?; and (5) Are there regional
variations in abilities and achievement? Aptitude and achievement
data were obtained for 324 school systems, 1488 schools, and 85,382
fourth-grade children. The test instruments used were the
Comprehensive Tests of Basic Skills and the Short Form Test of
Academic Aptitude published by CTB/McGraw-Hill. Three different
reports of the test data were supplied to all school systems. The
test data showed that the state as a whole exceeded the national
norms; the mean "obtained" scores were significantly higher than the
"anticipated" scores in all areas measured by the tests of basic
skills. Highest scores were in reading comprehension and the lowest
in arithmetic. From the test results, it was concluded that
Massachusetts fourth graders are slightly higher than the national
norms in all areas measured. Correlations between the subtest total
mean scores by school are statistically significant and very high.
Schools that did well on one subtest generally did well on all
subtests. A survey of school superintendents showed that 98% used the
test data. (DB)



U.S. DEPARTMENT OF HEALTH,
EDUCATION & WELFARE
OFFICE OF EDUCATION

THIS DOCUMENT HAS BEEN REPRODUCED EXACTLY AS RECEIVED FROM
THE PERSON OR ORGANIZATION ORIGINATING IT. POINTS OF VIEW OR
OPINIONS STATED DO NOT NECESSARILY REPRESENT OFFICIAL OFFICE
OF EDUCATION POSITION OR POLICY.

April, 1971



Massachusetts Fourth Grade Testing Program 1971 



Test Bulletin #1

In August 1970, Commissioner of Education Neil Sullivan announced plans
to test all Massachusetts fourth graders during January 1971. The department
had been moving toward this since early 1970, when the Bureau of Curriculum
Innovation assembled a committee to investigate methods of Needs Assessment,
as related to the joint evaluative concerns of both Title I and Title III.
In June 1970, another group was organized to rewrite state NDEA-VA into ESEA
Title III.

Areas explored by this group included procedures for determining educational
or instructional objectives, including Massachusetts' involvement with the
Instructional Objectives Exchange at UCLA; adopting a standardized testing
program; as well as alternatives to a standardized testing program. These latter
alternatives involved local selection of instructional objectives either from
a prepared list of objectives or from local specification of objectives teamed
with some form of evaluation.

Techniques varied from asking a test publisher such as Educational Testing
Service to prepare a series of comparable instruments from their item bank by
using item sampling techniques, to working with Project Comprehensive Achievement
Monitoring to tailor assessment to the individual school system. Analyzing the
results of each system's testing program was also explored. There appears to be
at present no direct way to compare the results of the several testing programs.
The standardization and norms vary for each of the testing programs. The time of
testing in Massachusetts's schools varies as well. Some systems conduct limited
testing while others test extensively. There is neither overlap nor comparability
among local testing programs.



The key questions to be answered by the testing program were:

• What are the levels of mastery of basic skills in Massachusetts
fourth grades? Are there differences in achievement between skills?

• What educational needs can be inferred for Massachusetts' students,
based on basic skills testing?

• Do testing data reveal the influence of Federal programs?

• Does the product of education vary according to available
resources - financial outlay, professional support, materials?

• Are there regional variations in abilities and achievement?



One of the greatest problems with a program of this magnitude, involving
every fourth grade classroom in Massachusetts, was the communication barrier
between the State Department and individual teachers. In order to inform all
concerned, meetings were planned with superintendents at their regular
round-table meetings in September and October. Pretest workshops for system test
coordinators were scheduled by geographic area during November to present
information about the instruments, the interpretation of test scores, and the
distribution and collection of materials. The Assistant Commissioner for Research
and Development taped a 15 minute television program to discuss the testing with
fourth grade teachers. Channel 2 (Boston) presented this tape on three
consecutive afternoons for the eastern part of the state and on closed circuit
television for the western half.

Instruments and Reports

The instruments selected for this program were the Comprehensive Tests of
Basic Skills (CTBS) and the Short Form Test of Academic Aptitude (SFTAA),
published by CTB/McGraw-Hill. CTBS provided information on levels of mastery
for both learning content and process areas. When used in relation with SFTAA,
CTBS provided information on actual or "obtained" achievement compared with
potential or "anticipated" achievement for individuals, classes, schools and
systems. These "anticipated" achievement scores were computed for each student
using multiple regression formulas which utilize certain predictors - age, grade
in school, sex and raw scores for each of the subtests of the California
Short-Form Test of Academic Aptitude.
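
As a minimal sketch of how an "anticipated" score and the resulting difference
score might be computed, the Python fragment below applies a multiple regression
formula of the kind described above to one student. The intercept, the weights,
the 0/1 coding of sex, the use of four aptitude subtests, and the sample values
are all invented for illustration; the publisher's actual formulas are not given
in this bulletin.

```python
from dataclasses import dataclass


@dataclass
class Student:
    age_months: float     # chronological age
    grade: float          # grade in school, e.g. 4.5 at mid-year
    sex: int              # coded 0 or 1 (this coding is an assumption)
    aptitude_raw: tuple   # raw scores on the aptitude subtests


# Hypothetical regression weights, NOT the publisher's values.
INTERCEPT = 100.0
W_AGE, W_GRADE, W_SEX = -0.5, 20.0, 2.0
W_APTITUDE = (1.5, 1.0, 2.0, 1.0)


def anticipated_score(s: Student) -> float:
    """Multiple regression prediction: intercept plus weighted predictors."""
    total = INTERCEPT + W_AGE * s.age_months + W_GRADE * s.grade + W_SEX * s.sex
    total += sum(w * x for w, x in zip(W_APTITUDE, s.aptitude_raw))
    return total


def difference_score(obtained: float, s: Student) -> float:
    """Positive when the student achieves above the anticipated level."""
    return obtained - anticipated_score(s)


student = Student(age_months=112, grade=4.5, sex=1, aptitude_raw=(14, 12, 15, 10))
anticipated = anticipated_score(student)        # 209.0 with these made-up weights
difference = difference_score(215.0, student)   # 6.0
```

With these made-up numbers the student's obtained score of 215 lies 6 points
above the anticipated score of 209, which would count as a positive difference
score.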

The Commonwealth supplied three different reports of test data for all
school systems:

• The "Administrator's Summary of Test Data Mean Values" which reported
subtest and total subtest mean "obtained" and "anticipated" scores
for each school as well as for the district as a whole.

• The "Combination Class Record" which reported "obtained" and "anticipated"
achievement scores for each class member and summary data for each class.

• The "Summary Report of the Right Response Record and Item Analysis" which
presented group item mastery data for both content and process dimensions.

Testing was conducted during the second week in January. Post-testing
workshops were conducted at the Department of Education Regional Centers during
the last half of March. Resource materials were prepared by Research and
Development for these workshops to enable local educational agencies to
effectively utilize test results. At this time it was suggested that teachers
focus on the difference score, which compared the "obtained" and "anticipated"
scores, plus the national percentile rank of the "obtained" achievement score
for each student. Through use of subtest scores plus tables summarizing item
content and process dimensions, teachers could identify individual student
instructional needs. System level evaluations relative to overall instructional
goals could be made based on the Right Response Summary and the Administrator's
Summary data.

Aptitude and achievement data were obtained for 324 school systems, 1488
schools, and 85,382 children. This impressive data bank provided the basis for
statistical analyses to answer questions verbalized in the initial planning
stages for the program as well as those evolved as a consequence of the testing.






Qualifying statements were necessary to view the results of the testing
program in the reality of the whole educational process:

• Different school systems spend different amounts of time on the
basic skills.

• Measures were obtained in the basic skills; not in the content fields,
art and music, or attitudes and values.

• Results of this testing cannot be considered an evaluation of teaching
per se, as teaching effectiveness interacts with instructional materials
and support services, administrative and supervisory leadership, parental
and community support, and cultural contributions of the immediate
environment.

Statewide Conclusions

The State data showed that 85,382 fourth graders reflected a mean
chronological age of 9 years, and a mean total IQ of 106. The State as a whole
exceeded the national norms. The mean "obtained" scores (ESS) were significantly
higher than the "anticipated" scores (AASS) in all areas measured by the tests
of basic skills. Highest scores were obtained in reading comprehension and
lowest in arithmetic computation.

Based on CTB/McGraw-Hill analyses, generalizations could be drawn regarding
student performance at three different IQ levels: 113 and above, 96-104, and
87 and below. The difference between "obtained" and "anticipated" scores was
significantly greater for the below average and above average groups than for
the average ability group, although all group "obtained" scores were
significantly higher than "anticipated" scores.

The "obtained" total battery achievement score was compared with elementary
instructional expenditures across districts. The prorated elementary cost figures
were rank ordered. The top 25%, consisting of systems spending $525 and above on
instruction, were designated high cost systems. The bottom 25%, consisting of
systems spending $425 and below, were designated low cost systems. The average
expenditure figure was $384 for the low cost systems and $594 for the high cost
systems. A t-test indicated no significant difference in achievement in basic
skills between high cost and low cost systems.
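
The comparison above can be sketched in Python as follows. The $525 and $425
cut points come from the text; the per-system costs and scores are made-up
sample values, and the use of Welch's unequal-variance t statistic is an
assumption, since the bulletin does not say which form of t-test was used.

```python
from math import sqrt
from statistics import mean, variance

HIGH_COST_MIN = 525  # dollars spent on instruction, cut point from the text
LOW_COST_MAX = 425   # dollars spent on instruction, cut point from the text


def welch_t(a, b):
    """Welch's t statistic for two independent samples."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


# (instructional cost, mean total battery score) per system; invented values
systems = [(600, 52.0), (560, 48.0), (530, 51.0), (480, 50.5),
           (450, 49.5), (420, 49.0), (380, 51.0), (350, 50.0)]

high = [score for cost, score in systems if cost >= HIGH_COST_MIN]
low = [score for cost, score in systems if cost <= LOW_COST_MAX]
t = welch_t(high, low)  # about 0.25 for these sample values
```

A t value this close to zero would be read as no significant difference.
Judging significance also requires the degrees of freedom and a critical value,
which this sketch leaves out.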

Based on the "Right Response Record and Item Analysis," levels of mastery
were averaged across all subtests and all districts. The means ranged from
13% to 84% of the students answering items correctly. It was concluded,
therefore, that Massachusetts fourth graders are slightly higher than the
national norms in all areas measured.

Correlations between the subtest total mean scores by schools are statistically
significant and very high. Schools that did well on one subtest generally did well
on all subtests. Consistent with other analyses of test results, there was no
relation between high expenditure instructional cost and high achievement in the
basic skills.

Interpretation Aids

The following analyses were distributed to superintendents: 

• A stanine scale based on difference scores across all districts
enabled an individual school system to view itself in relation
to State performance. Difference scores were chosen as they
control for different aptitude levels. The same procedure was
followed to prepare school stanine scales.

• T-tests were computed to determine the significance of the difference
scores for each subtest by system. This, too, was prepared for each
school.

• The number and percentage of over, under, and average achievers for
each system within each subtest was computed.

• The mean mental age for each system was also included.
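
A stanine scale of the kind mentioned in the first item can be sketched in
Python as below. The standard stanine bands hold 4, 7, 12, 17, 20, 17, 12, 7
and 4 percent of cases from lowest to highest. The mid-rank percentile rule,
the function name, and the sample data are assumptions for illustration and
not the Department's documented procedure.

```python
from bisect import bisect_right

# Cumulative percentages that close stanines 1 through 8
# (bands of 4, 7, 12, 17, 20, 17, 12 and 7 percent; stanine 9 takes the rest).
CUTOFFS = (4, 11, 23, 40, 60, 77, 89, 96)


def stanines(diff_scores):
    """Assign each mean difference score a stanine from 1 to 9 by rank.

    Tied scores are not given a shared rank in this simple version.
    """
    n = len(diff_scores)
    order = sorted(range(n), key=lambda i: diff_scores[i])
    result = [0] * n
    for rank, i in enumerate(order):
        percentile = 100 * (rank + 0.5) / n   # mid-rank percentile
        result[i] = bisect_right(CUTOFFS, percentile) + 1
    return result


# One hundred hypothetical district difference scores, already distinct.
sample = [d / 10 for d in range(-50, 50)]
scale = stanines(sample)
counts = [scale.count(k) for k in range(1, 10)]
```

For the hundred distinct sample values, `counts` reproduces the 4-7-12-17-20-17-12-7-4
pattern, so a district can read its standing directly from its stanine.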

Results of Test Use Survey

A survey was sent to all Massachusetts superintendents to ascertain the
extent of use of fourth grade testing data. Ninety-eight percent (98%) replied
that the data had been used. The following table indicates to whom data were
reported and the extent of in-service use:

REPORTED TO
    SCHOOL COMMITTEE          85%
    TEACHERS/PRINCIPAL        92%
    PARENTS                   35%
    NEWSPAPERS                 7%

    CURRICULUM PLANNING       52%
    IN-SERVICE WORKSHOPS      ?9% (first digit illegible in the source)

Other methods of use included: planning enrichment for middle groups,
student conferences, grouping for instruction, comparison with other testing
results, and counselor review of underachievement.

Members of the Department of Education visited twenty-nine schools across
Massachusetts which exhibited high positive difference scores in reading, language,
and arithmetic. The purpose of the visitations was to identify factors which
may be associated with high achievement in basic skills. These schools represented
a variety of socio-economic patterns and were located in inner-city, suburban and
rural areas.

It was the conclusion of the visitation teams that instruction in these
fourth grades was generally geared to the basic skills, even in the content
areas. Daily routines were well established, providing an economic use of
learning time: children were aware of the limits within which they functioned.
A high degree of respect - teacher for pupil, pupil for pupil and pupil for
teacher - was exhibited in most classrooms.

This report describes an initial attempt to assess through testing by
the Massachusetts State Department of Education. Future programs, based on
a heightened responsibility both to the profession and the public, will
reflect measurement in a greater variety of areas.



